1. Introduction


The future is uncertain, and economic agents must plan their activity in the face of this
uncertainty. In this planning, producing forecasts of the quantity of interest is the
traditional way of uncovering possible not-yet-realized trajectories. Feedback from the
estimated future dynamics then influences actual planning and business activities. This is
true for private decision-makers, such as firms and other types of organizations, but
especially for public policy-makers, since their activities produce effects at the level of
the whole country.

The increasing availability of data, together with progress in computational techniques, has
encouraged researchers to construct more sophisticated forecasting models and to improve
their accuracy. Nowadays, available forecasting models range from classical econometric
models, e.g. ARIMA, to non-parametric models, e.g. exponential smoothing, to machine-learning
models, e.g. trees and neural networks. The result is a plethora of single forecasting models
available to both private and public decision-makers. In the late 1970s, a group of academic
researchers proposed the idea of a competition among different forecasting models
(Makridakis et al., 1982). It emerged that statistically sophisticated models do not
necessarily produce more accurate forecasts, whereas combinations of models outperform single
models. Moreover, the ranking of forecasting models depends on the accuracy measure being
used as well as on the adopted forecast horizon. The success of the first so-called
M-competition (M stands for Makridakis) started a tradition of forecasting competitions
(Hyndman, 2020) that continues today with the recent M4 and M5 competitions (Petropoulos and
Makridakis, 2020; Makridakis et al., 2021). Given a set of time series at different
frequencies, several models compete to produce the best forecast. Models' performances are
then ranked based on some accuracy measures. Building on the idea of competition among
different forecasting methods, this work compares their forecasting performances over a given
time horizon. Unlike the M-competitions, which are based on thousands of time series at
different frequencies, a single univariate time series is selected at the monthly frequency.

The motivation for this choice is to show that, even in the simplest exercise of forecasting
a single time series, the ex-ante choice of a model is likely to be misleading, because a
model ranking exists and is specific to the time dimension (hence, the frequency) and to the
object measured by the single series. Indeed, when a set of forecasting models is available,
a semi-automatic model-selection algorithm based on some performance measures would be a
superior choice for the various decision-makers. In the case at hand, the monthly
unemployment rate is chosen because it is the most common measure of the (mal)functioning of
the labour market and, as such, is of utmost importance for policymakers.

Forecasting models are finally ranked based on some accuracy measures. The main findings 
confirm that, given N forecasting models, combination techniques outperform single uncom- 
bined models in terms of accuracy and reduce the risk of adopting a single forecasting model. 


2. Forecasting Models


The comparative forecasting exercise presented in this work comprises a set of 23 different
uncombined and combined models. The time series on which all models are trained is the
seasonally adjusted Italian unemployment rate over the years 2004-2019 at the monthly
frequency, freely available from the ISTAT data warehouse (http://dati.istat.it/). The
observation period is split into a training set, from January 2004 to June 2019, and a test
set, from July to December 2019. The set of selected forecasting models contains some
ARIMA-like models, some Exponential Smoothing models, and some machine learning models. It
also contains combinations of them based on some model averaging techniques. For the sake of
brevity, the list is reported in table 1. All computations are carried out with the
statistical software R, using the most recent versions of the relevant packages. Model
specifications and other details can be provided upon request.


FAMILY                 Label       Model                                            Reference                               R package
ARIMA                  ARIMA       ARIMA                                            Hyndman and Khandakar (2008)            forecast
ARIMA                  ARFIMA      Fractionally-differenced ARIMA                   Peiris and Perera (1988)                forecast
ARIMA                  GARMA       Gegenbauer-ARIMA                                 Dissanayake et al. (2016)               garma
ARIMA                  SSARIMA     State-space ARIMA                                Svetunkov and Boylan (2020)             smooth
Exponential Smoothing  ES          Exponential Smoothing                            Brown (1956)                            ets
Exponential Smoothing  HOLT        Linear Exponential Smoothing                     Holt and Modigliani (1960)              forecast
Exponential Smoothing  THETA       Exponential Smoothing with drift                 Assimakopoulos and Nikolopoulos (2000)  forecast
Exponential Smoothing  CES         Complex Exponential Smoothing                    Svetunkov and Kourentzes (2018)         smooth
Exponential Smoothing  GUM         State-space Exponential Smoothing                Svetunkov and Kourentzes (2018)         smooth
Machine Learning       ARML        Bagged AR                                        -                                       caretForecast
Machine Learning       BAG         Bagged Exponential Smoothing                     Bergmeir et al. (2016)                  forecast
Machine Learning       NN          Feed-forward Neural Network                      -                                       forecast
Hybrid                 ADAM        Augmented Dynamic Adaptive Model                 Hyndman and Khandakar (2008)            smooth
Hybrid                 BATS        GUM with ARMA errors                             De Livera et al. (2011)                 forecast
Hybrid                 ATA         Combination of ES and ARIMA                      Yapar et al. (2017)                     ATAforecasting
Hybrid                 SPL         Cubic Spline                                     Chambers and Hastie (2017)              forecast
Combinations           COMB1       Combination of ETS, SSARIMA, GUM and CES         -                                       smooth
Combinations           COMB2       Combination of ARIMA, ETS, THETA, NN and BATS    -                                       forecastHybrid
Combinations           COMB3       Combination of ARML and SPL with simple weights  -                                       ForecastCombinations
Combinations           COMB4_BG    COMB3 with Bates-Granger weights                 -                                       ForecastComb
Combinations           COMB4_InvW  COMB3 with Inverse Rank approach                 -                                       ForecastComb
Combinations           COMB4_MED   COMB3 with Dynamic weighting scheme              -                                       ForecastComb
Combinations           COMB5       Combination of all models except COMBs           -                                       ForecastCombinations


Table 1: Selection of forecasting models. 
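
The training/test split described at the start of this section can be checked with a few
lines of code. The sketch below is only illustrative and is written in Python rather than R;
the helper name month_range is hypothetical and not part of the original computations. It
counts the monthly observations that fall in each set.

```python
from datetime import date

def month_range(start, end):
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

# Observation period and split point as stated in the text.
months = month_range(date(2004, 1, 1), date(2019, 12, 1))
last_training_month = date(2019, 6, 1)

train_months = [m for m in months if m <= last_training_month]
test_months = [m for m in months if m > last_training_month]

print(len(train_months), len(test_months))  # 186 training months, 6 test months
```

The six held-out months correspond to the six-month forecast horizon used in the evaluation.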


Once all forecasting models have been estimated, it is interesting to compare statistics of
model fitting in terms of the moments of the corresponding error distribution. To this end,
table 2 below provides rank values (column RANK) for each forecasting model based on a total
score (SCORE). The latter statistic is computed as the sum of the single ranks obtained in
terms of mean (RANK-MEAN), standard deviation (RANK-SD), skewness (RANK-SKEWNESS), and
kurtosis (RANK-KURTOSIS), so that a lower score corresponds to a better rank.


FAMILY                 MODEL       RANK-MEAN  RANK-SD  RANK-SKEWNESS  RANK-KURTOSIS  SCORE  RANK
ARIMA                  ARIMA       2          14       23             10             49     13
ARIMA                  ARFIMA      20         15       20             7              62     19
ARIMA                  GARMA       15         9        11             6              41     11
ARIMA                  SSARIMA     21         18       19             18             76     23
Exponential Smoothing  ES          14         8        9              3              34     5
Exponential Smoothing  HOLT        12         7        10             4              33     4
Exponential Smoothing  THETA       22         22       15             16             75     21
Exponential Smoothing  CES         5          20       17             19             61     18
Exponential Smoothing  GUM         23         21       16             15             75     21
Machine Learning       ARML        18         11       1              23             53     14
Machine Learning       BAG         19         12       13             9              53     14
Machine Learning       NN          1          23       5              8              37     8
Hybrid                 ADAM        17         17       22             14             70     20
Hybrid                 BATS        13         10       12             5              40     10
Hybrid                 ATA         3          19       14             12             48     12
Hybrid                 SPL         6          6        8              1              21     1
Combinations           COMB1       4          16       21             13             54     16
Combinations           COMB2       16         13       18             11             58     17
Combinations           COMB3       9          3        2              21             35     7
Combinations           COMB4_BG    7          1        7              17             32     3
Combinations           COMB4_InvW  8          2        4              20             34     5
Combinations           COMB4_MED   10         4        3              22             39     9
Combinations           COMB5       11         5        6              2              24     2


Table 2: Ranking of forecasting models in terms of model fitting. 
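
The scoring scheme behind table 2 can be sketched as follows. This is a minimal illustration
in Python, not the original R computation, and it rests on assumptions that the text does not
spell out: the mean and the skewness are ranked by absolute value, rank 1 goes to the smallest
value, and tied models share the lowest rank (as the tied ranks in table 2 suggest). The
function names and the toy residuals are hypothetical.

```python
from statistics import mean, pstdev

def moments(errors):
    """Mean, standard deviation, skewness and kurtosis of a residual series."""
    m = mean(errors)
    s = pstdev(errors)
    skew = mean(((e - m) / s) ** 3 for e in errors)
    kurt = mean(((e - m) / s) ** 4 for e in errors)
    # Assumption: mean and skewness are compared in absolute value.
    return {"mean": abs(m), "sd": s, "skewness": abs(skew), "kurtosis": kurt}

def competition_rank(values):
    """Rank 1 = smallest value; ties share the lowest rank (1, 2, 2, 4)."""
    return {k: 1 + sum(other < v for other in values.values())
            for k, v in values.items()}

def score_models(residuals):
    """Sum the four moment ranks per model, then rank the total scores."""
    stats = {name: moments(errs) for name, errs in residuals.items()}
    scores = {name: 0 for name in residuals}
    for key in ("mean", "sd", "skewness", "kurtosis"):
        ranks = competition_rank({name: stats[name][key] for name in stats})
        for name, rank in ranks.items():
            scores[name] += rank
    return scores, competition_rank(scores)

# Toy in-sample residuals for three hypothetical models.
residuals = {
    "A": [0.1, -0.1, 0.2, -0.2, 0.0],
    "B": [0.5, 0.4, 0.6, 0.5, 0.3],
    "C": [-0.3, 0.3, -0.1, 0.1, 1.0],
}
scores, overall = score_models(residuals)
print(scores, overall)
```

Because each criterion contributes a rank rather than a raw value, the total score weights the
four moments equally regardless of their scale.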


What emerges from table 2 is that, in terms of model fitting, the best-performing forecasting
model is SPL, followed by COMB5, COMB4_BG, HOLT, and so on. In detail, the error distribution
of the NN model is associated with the lowest mean error and that of COMB4_BG with the lowest
dispersion, whereas ARML and SPL are characterized by the lowest skewness and kurtosis,
respectively. Although model fitting is an important quality feature of forecasting models,
it is not the decisive dimension when a decision-maker needs to adopt a single forecasting
model. As shown in the next section, the accuracy of the forecasts may lead to different
conclusions.


3. Results 


Figure 1 shows the forecasts produced by each model on the test set over a time horizon of
six months. The ARML model fails to capture the dynamics of the actual data, even though its
model fit is characterized by the lowest skewness. On the contrary, the COMB2 forecasts
closely mimic the dynamics of the Italian unemployment rate, even though its model fit is not
the best in any moment of the error distribution.


Figure 1: Forecasts of Italian unemployment rate. ARIMA models (solid line): ARFIMA,
ARIMA, GARMA, SSARIMA. Combinations (COMB, two-dashed line): COMB1, COMB2,
COMB3, COMB4_BG, COMB4_InvW, COMB4_MED. Exponential Smoothing (ES, dotted
line): CES, ES, GUM, HOLT, THETA. Hybrid models (dot-dashed line): ADAM, ATA, BATS,
SPL. Machine Learning models (ML, long-dashed line): ARML, BAG, NN.


These considerations confirm that model fitting, despite being an important aspect to
consider in the selection of forecasting models, does not ensure a correspondingly good
forecast performance. Instead, the use of various ensembling techniques delivers satisfactory
results compared to those of single uncombined models. On this point, note also from figure 1
that the actual dynamics of the unemployment rate are contained within the full set of
forecasts. This means that a suitable model combination can be obtained by appropriately
ensembling some of the models under scrutiny.
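
To make the idea of ensembling concrete, the following sketch shows three common ways of
weighting individual forecasts: equal weights, Bates-Granger weights (proportional to the
inverse of each model's mean squared error) and inverse-rank weights (proportional to the
inverse of each model's error rank). It is a Python illustration of the textbook formulas
under these assumptions, not the code of the R packages used in this work; all function names
and the toy numbers are hypothetical.

```python
def mse(actual, fitted):
    """Mean squared error between two equally long series."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)

def normalise(raw):
    """Scale positive raw weights so that they sum to one."""
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

def simple_weights(fits):
    return {name: 1 / len(fits) for name in fits}

def bates_granger_weights(actual, fits):
    # Assumes no model fits perfectly (a zero MSE would divide by zero).
    return normalise({name: 1 / mse(actual, f) for name, f in fits.items()})

def inverse_rank_weights(actual, fits):
    errors = {name: mse(actual, f) for name, f in fits.items()}
    order = sorted(errors, key=errors.get)  # best model first
    return normalise({name: 1 / (order.index(name) + 1) for name in errors})

def combine(weights, forecasts):
    """Weighted average of the individual forecasts at each horizon step."""
    horizon = len(next(iter(forecasts.values())))
    return [sum(weights[n] * forecasts[n][h] for n in forecasts)
            for h in range(horizon)]

# Toy in-sample data for two hypothetical models.
actual = [10.0, 10.2, 10.1, 9.9]
fits = {"m1": [10.1, 10.1, 10.2, 10.0], "m2": [10.4, 9.8, 10.5, 9.5]}
forecasts = {"m1": [9.8, 9.7], "m2": [10.0, 10.1]}

bg = bates_granger_weights(actual, fits)
inv = inverse_rank_weights(actual, fits)
print(combine(simple_weights(fits), forecasts))
print(combine(bg, forecasts))
print(combine(inv, forecasts))
```

In this toy case the Bates-Granger scheme concentrates more weight on the better-fitting
model than the inverse-rank scheme does, because it reacts to the size of the error gap and
not only to the ordering.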

Finally, table 3 reports the rank of each model under the accuracy measures used in the
various forecasting competitions: ME (mean error), MAE (mean absolute error), MPE (mean
percentage error), MSE (mean squared error), MAPE (mean absolute percentage error), RMSSE
(root mean squared scaled error), RAME (relative absolute mean error), RMAE (root mean
absolute error) and RRMSE (relative root mean squared error).


FAMILY                 MODEL       ME     MAE    MPE    MSE    MAPE   RMSSE  RAME   RMAE   RRMSE  SCORE  RANK
ARIMA                  ARIMA       19     18     19     18     18     18     19     18     18     165    18
ARIMA                  ARFIMA      14     15     13     15     15     15     14     15     15     131    16
ARIMA                  GARMA       15     10     15     16     10     16     15     10     16     123    14
ARIMA                  SSARIMA     1      4      1      3      4      3      1      4      3      24     3
Exponential Smoothing  ES          9      12     8      11     12     11     9      12     11     95     11
Exponential Smoothing  HOLT        6      11     6      10     11     10     6      11     10     81     9
Exponential Smoothing  THETA       7      3      10     4      3      4      7      3      4      45     5
Exponential Smoothing  CES         4      1      4      2      1      2      4      1      2      21     2
Exponential Smoothing  GUM         3      2      3      1      2      1      3      2      1      18     1
Machine Learning       ARML        23     23     23     23     23     23     23     23     23     207    23
Machine Learning       BAG         10     14     9      13     14     13     10     14     13     110    13
Machine Learning       NN          13     6      14     6      6      6      13     6      6      76     7
Hybrid                 ADAM        12     16     12     14     16     14     12     16     14     126    15
Hybrid                 BATS        8      9      7      9      9      9      8      9      9      77     8
Hybrid                 ATA         17     7      17     7      7      7      17     7      7      93     10
Hybrid                 SPL         22     22     22     22     22     22     22     22     22     198    22
Combinations           COMB1       11     13     11     12     13     12     11     13     12     108    12
Combinations           COMB2       2      5      2      5      5      5      2      5      5      36     4
Combinations           COMB3       20     19     20     19     19     19     20     19     19     174    19
Combinations           COMB4_BG    18     21     18     21     21     21     18     21     21     180    21
Combinations           COMB4_InvW  5      8      5      8      8      8      5      8      8      63     6
Combinations           COMB4_MED   20     19     20     19     19     19     20     19     19     174    19
Combinations           COMB5       16     17     16     17     17     17     16     17     17     150    17


Table 3: Ranking of forecasting models in terms of accuracy measures. 
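
For reference, several of the accuracy measures listed above can be computed as in the sketch
below. It is a Python illustration based on commonly used definitions, which are assumptions
here since the text does not state the exact formulas: errors are taken as actual minus
forecast, percentage errors are relative to the actual value, and RMSSE scales the
out-of-sample MSE by the in-sample MSE of the one-step naive forecast. The relative measures
(RAME, RRMSE) are left out because the benchmark they refer to is not specified. The function
name and the toy data are hypothetical.

```python
from math import sqrt

def accuracy(train, actual, forecast):
    """Return a dictionary of accuracy measures for one model."""
    errors = [a - f for a, f in zip(actual, forecast)]
    h = len(errors)
    pct = [100 * e / a for e, a in zip(errors, actual)]
    mse = sum(e * e for e in errors) / h
    # In-sample MSE of the naive forecast (previous observation).
    naive_mse = sum((train[t] - train[t - 1]) ** 2
                    for t in range(1, len(train))) / (len(train) - 1)
    return {
        "ME": sum(errors) / h,
        "MAE": sum(abs(e) for e in errors) / h,
        "MPE": sum(pct) / h,
        "MSE": mse,
        "MAPE": sum(abs(p) for p in pct) / h,
        "RMSSE": sqrt(mse / naive_mse),
    }

# Toy data: a short training series and a two-step test set.
train = [10.0, 10.5, 10.0, 10.5, 10.0]
actual = [10.0, 10.0]
forecast = [9.5, 10.5]

measures = accuracy(train, actual, forecast)
print(measures)
```

In the toy example the two errors cancel out, so ME and MPE are zero while MAE and MAPE are
not. Signed and absolute measures can therefore rank the same model very differently.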


As expected, the overall ranking of forecasting models in terms of accuracy measures differs
from the ranking in terms of model fitting presented in table 2. Now, the best-performing
forecasting model is GUM, followed by CES and SSARIMA. Among all model combinations, only
COMB2 and COMB4_InvW rank well, being the fourth and the sixth best-performing models,
respectively. The forecasting models SPL and ARML occupy the next-to-last and last positions,
respectively.


4. Conclusions 


The results confirm that no single, universally superior model exists yet. On the contrary,
the ranking of different forecasting models is specific to the adopted training set. For
example, when the time series of interest switches from the unemployment rate to the
employment rate, the ranking of model performances changes. Secondly, the results confirm
that machine learning and neural network models offer satisfactory alternatives to
traditional econometric models like ARIMA or the non-parametric Exponential Smoothing.
Finally, the results stress the importance of model ensemble techniques as a solution to
model uncertainty as well as a tool to improve forecast accuracy (Shaub, 2020).

Overall, the flexibility provided by a rich set of forecasting models, together with the
possibility of combining them, represents an advantage for decision-makers, who are often
constrained to adopt only pure, uncombined forecasting models.